Discovering Sparse Covariance Structures with the Isomap
Abstract
Regularization of covariance matrices in high dimensions is usually either based on a known ordering of variables or ignores the ordering entirely. This paper proposes a method for discovering meaningful orderings of variables based on their correlations using the Isomap, a non-linear dimension reduction technique designed for manifold embeddings. These orderings are then used to construct a sparse covariance estimator, which is block-diagonal and/or banded. Finding an ordering to which banding can be applied is desirable because banded estimators have been shown to be consistent in high dimensions. We show that in situations where the variables do have such a structure, the Isomap does very well at discovering it, and the resulting regularized estimator performs better for covariance estimation than other regularization methods that ignore variable order, such as thresholding. We also propose a bootstrap approach to constructing the neighborhood graph used by the Isomap, and show it leads to better estimation. We illustrate our method on data on protein consumption, where the variables (food types) have a structure but it cannot be easily described a priori, and on a gene expression data set.
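The pipeline the abstract describes — turn correlations into distances between variables, embed the variables with Isomap to recover an ordering, then band the sample covariance in that order — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `1 - |corr|` dissimilarity, the neighborhood size, and the bandwidth `k` are all assumptions chosen for the example, and it uses scikit-learn's `Isomap` rather than the bootstrap neighborhood-graph construction the paper proposes.

```python
import numpy as np
from sklearn.manifold import Isomap

rng = np.random.default_rng(0)

# Simulate data whose p variables have a hidden serial structure
# (a random walk across the ordered variables), then scramble the
# columns so the ordering is unknown to the estimator.
n, p = 200, 30
base = rng.standard_normal((n, p)).cumsum(axis=1)
perm = rng.permutation(p)
X = base[:, perm]

# Dissimilarity between variables from their correlations.
# (One plausible choice; the paper's exact distance may differ.)
R = np.corrcoef(X, rowvar=False)
D = 1.0 - np.abs(R)

# Embed the p variables in one dimension with Isomap on the
# precomputed distance matrix; sorting the embedding coordinates
# yields a candidate ordering of the variables.
emb = Isomap(n_neighbors=5, n_components=1,
             metric="precomputed").fit_transform(D)
order = np.argsort(emb[:, 0])

# Band the sample covariance in the recovered order: keep only
# entries within k of the diagonal, zero out the rest.
k = 3
S = np.cov(X[:, order], rowvar=False)
mask = np.abs(np.subtract.outer(np.arange(p), np.arange(p))) <= k
S_banded = S * mask
```

If Isomap recovers the hidden ordering (up to reversal), the banded estimator retains the large near-diagonal covariances and discards noisy far-off-diagonal entries — the regime in which banded estimators are known to be consistent in high dimensions.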
Related papers
Non-linear Dimensionality Reduction by Locally Linear Isomaps
Algorithms for nonlinear dimensionality reduction (NLDR) find meaningful hidden low-dimensional structures in a high-dimensional space. Current algorithms for NLDR are Isomaps, Local Linear Embedding and Laplacian Eigenmaps. Isomaps are able to reliably recover low-dimensional nonlinear structures in high-dimensional data sets, but suffer from the problem of short-circuiting, which occurs when t...
Graph approximations to geodesics on embedded manifolds
In [1] Tenenbaum, de Silva and Langford consider the problem of non-linear dimensionality reduction: discovering intrinsically low-dimensional structures embedded in high-dimensional data sets. They describe an algorithm, called Isomap, and demonstrate its successful application to several real and synthetic data sets. In this paper, we discuss some of the theoretical claims for ...
Global Versus Local Methods in Nonlinear Dimensionality Reduction
Recently proposed algorithms for nonlinear dimensionality reduction fall broadly into two categories which have different advantages and disadvantages: global (Isomap [1]), and local (Locally Linear Embedding [2], Laplacian Eigenmaps [3]). We present two variants of Isomap which combine the advantages of the global approach with what have previously been exclusive advantages of local methods: c...
An inexact interior point method for L1-regularized sparse covariance selection
Sparse covariance selection problems can be formulated as log-determinant (log-det) semidefinite programming (SDP) problems with large numbers of linear constraints. Standard primal-dual interior-point methods that are based on solving the Schur complement equation would encounter severe computational bottlenecks if they are applied to solve these SDPs. In this paper, we consider a customized ...
Learning Multiple Tasks with a Sparse Matrix-Normal Penalty
In this paper, we propose a matrix-variate normal penalty with sparse inverse covariances to couple multiple tasks. Learning multiple (parametric) models can be viewed as estimating a matrix of parameters, where rows and columns of the matrix correspond to tasks and features, respectively. Following the matrix-variate normal density, we design a penalty that decomposes the full covariance of ma...
Journal:
Volume, Issue
Pages -
Publication date: 2008